Accurate and compact large vocabulary speech recognition on mobile devices

Authors

  • Xin Lei
  • Andrew W. Senior
  • Alexander Gruenstein
  • Jeffrey S. Sorensen
Abstract

In this paper we describe the development of an accurate, small-footprint, large vocabulary speech recognizer for mobile devices. To achieve the best recognition accuracy, state-of-the-art deep neural networks (DNNs) are adopted as acoustic models. A variety of speedup techniques for DNN score computation are used to enable real-time operation on mobile devices. To reduce the memory and disk usage, on-the-fly language model (LM) rescoring is performed with a compressed n-gram LM. We were able to build an accurate and compact system that runs well below real-time on a Nexus 4 Android phone.

Similar resources

Efficient codebook for fast and accurate low resource ASR systems

Nowadays, speech interfaces have become widely employed in mobile devices, thus recognition speed and power consumption are becoming new metrics of Automatic Speech Recognition (ASR) performance. For ASR systems using continuous Hidden Markov Models (HMMs), the computation of the state likelihood is one of the most time consuming parts. Hence, we propose in this paper novel multi-level Gaussian...

Efficient codebooks for fast and accurate low resource ASR systems

Today, speech interfaces have become widely employed in mobile devices, thus recognition speed and resource consumption are becoming new metrics of Automatic Speech Recognition (ASR) performance. For ASR systems using continuous Hidden Markov Models (HMMs), the computation of the state likelihood is one of the most time consuming parts. In this paper, we propose novel multi-level Gaussian selec...

Spoken Term Detection for Persian News of Islamic Republic of Iran Broadcasting

Islamic Republic of Iran Broadcasting (IRIB), as one of the biggest broadcasting organizations, produces thousands of hours of media content daily. Accordingly, IRIB's archive is one of the richest archives in Iran, containing a huge amount of multimedia data. Monitoring this massive volume of data, and browsing and retrieving this archive, is one of the key issues for this broadcasting...

A low-power hardware search architecture for speech recognition

High-performance speech recognition is extremely computationally expensive, limiting its use in the mobile domain. We therefore propose a low-power hardware speech recognition architecture for mobile applications, exploiting the orders-of-magnitude efficiency improvements dedicated hardware can offer. Our system is based on the Sphinx 3.0 software recognizer developed at Carnegie Mellon Univers...

Advances in Large Vocabulary Continuous Speech Recognition

The development of robust, accurate and efficient speech recognition systems is critical to the widespread adoption of a large number of commercial applications. These include automated customer service, broadcast news transcription and indexing, voice-activated automobile accessories, large-vocabulary voice-activated cellphone dialing, and automated directory assistance. This article provides ...

Publication date: 2013